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Abstract 



We present a real-space formulation for coarse-graining Kohn-Sham Density Functional Theory that signif- 
icantly speeds up the analysis of material defects without appreciable loss of accuracy. The approximation 
scheme consists of two steps. First, we develop a linear-scaling method that enables the direct evalua- 
tion of the electron density without the need to evaluate individual orbitals. We achieve this by performing 
i-^ \ Gauss quadrature over the spectrum of the linearized Hamiltonian operator appearing in each iteration of the 

self-consistent field method. Building on the linear-scaling method, we introduce a spatial approximation 
scheme resulting in a coarse-grained Density Functional Theory. The spatial approximation is adapted so as 
to furnish fine resolution where necessary and to coarsen elsewhere. This coarse-graining step enables the 
analysis of defects at a fraction of the original computational cost, without any significant loss of accuracy. 
Furthermore, we show that the coarse-grained solutions are convergent with respect to the spatial approx- 
imation. We illustrate the scope, versatility, efficiency and accuracy of the scheme by means of selected 
c/3 1 examples. 

^ . Key words: Kohn-Sham; Density functional theory; Coarse-graining; Defects; Linear-scaling; Gauss 



quadrature 



1. Introduction 

Crystal defects play a critical role in determining macroscopic properties of solids even when present 
in small concentrations (Phillips, 2001). For instance, vacancies, perhaps the simplest type of defect, are 
fundamental to phenomena like creep, spall and radiation ageing even when present in only parts per mil- 
lion. Dislocations enable plasticity even though their densities are typically as small as 10 -8 per atomic row. 
Stacking faults also influence plasticity and surface energies affect fracture. Further, small changes in com- 
position often have a dramatic effect on mechanical properties because they segregate to defects. The reason 
that defects play such a profound role is that they bring together the chemical effects of the core, the discrete 
effects of the lattice and the long-range effects of the elastic fields. Consider specifically a vacancy. On the 
one hand, the nature of the core is determined by the chemistry of the dangling bonds. On the other hand, 
the vacancy gives rise to displacements of atoms that decay at a polynomial rate, and this enables long-range 
interactions with other defects as well as macroscopicically applied fields. This coupling grows stronger in 
extended defects. Thus, a complete understanding of defects at physically relevant concentrations requires 
a simultaneous study of the details of the core as well as the long-range elastic fields. 

This remains an outstanding challenge. On the one hand, methods like Density Functional Theory 
(DFT) that are capable of accurately describing the chemistry of the defect core are much too complicated 
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and expensive to use for defects with long-range fields. On the other hand, methods capable of describing 
the long-range fields including atomistic and continuum methods, lack the fidelity and rely on empiricism 
to describe the core. This challenge has prompted the development of multiscale approaches which seek to 
use DFT or tight-binding (TB) for the core, atomistic models for the near field and continuum for the far- 
field (e.g., Govind et al. (1999); Choly et al. (2005); Lu et al. (2006); Bernstein et al. (2009)). Others have 
used parameter passing where detailed simulations are used to fit the parameters of a coarser model (e.g., 
Cuitino et al. (2002)). While these multiscale methods provide valuable insight, there is no seamless tran- 
sition from DFT to TB or empirical potentials to continuum. Uncontrolled approximations resulting from 
these seams, the use of linear-response theory and kinematic assumptions like the Cauchy-Born hypothesis 
may render the methods unreliable. Furthermore, there is often no guarantee of convergence of the solutions 
provided by these methods to the full DFT solution. 

An alternative approach to multiscale modeling was initiated through the quasicontinuum method (Tadmor et al., 
1996). Here, one seeks to solve a single theory in the complete region, but using an adaptive numerical 
method where all the details are represented close to the core but only sampled farther away. There is no 
additional modeling (uncontrolled approximations), and all the approximations are in the numerical scheme 
so that it converges to the detailed theory with increased resolution. Tadmor et al. (1996) and subsequent 
refinements (see Miller and Tadmor (2009) for a recent review) use an atomistic approach with empirical 
potentials as the only theory. To that extent, it is unable to describe the full details of the defect core. 

The Schrodinger equation is fundamental for describing the quantum mechanical electronic structure 
of matter. However, the solution of the Schrodinger equation is exceedingly expensive (scaling exponen- 
tially with number of atoms) and therefore impractical. A far-reaching reformulation of the problem was 
achieved by Hohenberg and Kohn (1964), who proved the existence of a one-to-one correspondence be- 
tween the ground-state electron density and the ground-state wavefunction of a many-particle system. By 
this correspondence, the electron density replaces the many-body electronic wavefunction as the funda- 
mental unknown field, thereby greatly reducing the dimensionality and computational complexity of the 
problem. However, this theory requires an unknown but universal (independent of material) exchange 
and conelation functional. Computationally convenient models have been developed, including the local 
density approximation (LDA) (Kohn and Sham, 1965) and the generalized gradient approximation (GGA) 
(Langreth and Mehl, 1983; Perdew et al., 1992). 

Traditional formulations of DFT follow Kohn and Sham (1965) and solve for the orbitals (Chelikowsky et al., 
1994; Kresse and Furthmuller, 1996; Pasketal., 1999; Ismail-Beigi and Arias, 2000; Segalletal., 2002; 
Gonze et al., 2002; Tsuchida, 2004; Castro et al., 2006; Suryanarayana et al., 2010, 2011). Since the num- 
ber of orbitals is proportional to the number of atoms and they have to be mutually orthogonal, these formu- 
lations typically result in cubic-scaling with respect to the number of atoms. This places severe restrictions 
on the size of the systems which can be studied. To overcome this, a number of methods have been pro- 
posed that exhibit better scaling properties, with particular emphasis on linear-scaling (Galli and Parrinello, 
1992; Mauri et al, 1993; Goedecker, 1999; Skylaris et al., 2005; Barrault et al., 2007; Garcia-Cervera et al., 
2009; Bowler and Miyazaki, 2012). See Goedecker (1999) and Bowler and Miyazaki (2012) for exhaus- 
tive reviews of these methods. Briefly, a vast majority of linear scaling methods avoid orbitals and instead 
reformulate DFT in terms of the so-called density matrix. This is a diagonally dominant matrix and linear 
scaling is obtained by cutting off the decaying off-diagonal components. Often a basis set made of local- 
ized Wannier functions is used to represent the density matrix, and the decay of off-diagonal components is 
equivalent to the decay of these functions. The cutoff is acceptable in insulators where the band gap results 
in a strong exponential decay. In metals however, one only has polynomial decay in general. While, it 
has been shown that exponential decay may exist in idealized examples at finite temperatures (Goedecker, 
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1999), a mathematical understanding of the decay properties and consequently the accuracy of these meth- 
ods remain unclear. Therefore, the development of a linear-scaling method for metallic systems remains an 
open problem (Cances et al., 2008). 

Even with a linear scaling DFT method, the study of crystal defects at realistic concentrations remains 
a daunting problem. Gavini et al. (2007) extended the quasicontinuum approach to electronic structure by 
implementing it on a simplified version of DFT known as orbital-free DFT (OFDFT). OFDFT models the 
kinetic energy of the electrons instead of computing it explicitly using wave functions, and is limited in its 
applicability to free electron metals like Aluminum. Gavini et al. (2007) use a nested discretization to fully 
resolve both the atomic displacements and electronic density close to the core, but sample it elsewhere. They 
demonstrated the efficacy of their method by exhibiting computational savings of a factor of 10 3 or more, 
and also showed through examples how the physics can change depending on the size of the computational 
cell (concentration of defects). Unfortunately, it is not possible to directly extend their idea to the standard 
formulation of DFT for three main reasons. First, orbitals are subject to orthogonality, which is a global 
constraint and difficult to coarse-grain. From a practical point of view, orthogonality means that orbitals 
oscillate extremely rapidly and this makes the quasicontinuum representation difficult. Second, they are not 
periodic even in homogenous systems (instead, they are Bloch-Floquet waves). Finally, one has as many 
orbitals as electrons, and thus the method would scale poorly even with a quasicontinuum representation of 
each orbital. 

In this paper, we present a method that overcomes these difficulties. Specifically, we present a for- 
mulation (which we call CGDFT) that seamlessly coarse-grains DFT by recourse to controlled numerical 
approximations without the introduction of new or spurious physics. We accomplish this through a series of 
steps. First, we reformulate DFT in such a manner that eliminates the need to explicitly compute orbitals. 
Instead of computing the orbitals, and then using them to compute the quantities of interest including Fermi 
energy, electron density and band structure energy, we use integral representations of these quantities in 
terms of the projected density of states. Thus, the theory is formulated in terms of smoothly varying quanti- 
ties that are amenable to coarse-graining. Second, we perform numerical Gauss quadrature on the spectrum 
of the Hamiltonian and use operator theory to evaluate these integrals. Third, we develop an algorithm 
that truly scales linearly with the number of atoms. Together we refer to these steps as the Linear Scaling 
Spectral Gauss Quadrature (LSSGQ) method. As a final step, the spatial approximation is adapted so as 
to furnish fine resolution where necessary and to coarsen elsewhere. This coarse-graining step enables the 
analysis of defects at a fraction of the original computational cost, without any significant loss of accuracy. 
Furthermore, we show that the CGDFT solutions are convergent with respect to the spatial approximation. 
These properties render CGDFT to be highly transferable, efficient and accurate. 

The fundamental difference of the LSSGQ formulation from existing linear-scaling methods is that 
no explicit cutoff is employed. The formulation resembles the Fermi operator expansion (FOE) method 
(Goedecker, 1999; Bowler and Miyazaki, 2012), but there are some notable differences. Importantly, in- 
stead of expanding the density matrix in terms of a polynomial basis or rational functions, we use Gauss 
quadratures to evaluate diagonal elements of the density matrix and other quantities of interest. For this 
reason, LSSGQ does not involve the representation of the complete density matrix. Instead, only the com- 
ponents of interest are extracted. Further, the Fermi energy is not needed as an input to the method. This 
enables significant savings since the method does not have to be iterated over for the calculation of the 
Fermi energy. Additionally, symmetries in the system can be easily utilized to significantly reduce the com- 
putational effort. Finally the proposed method offers adaptivity through controllable accuracy in space, a 
useful feature for multiscale and coarse-graining formulations. The LSSGQ fomulation also has similaries 
to the recursion method used in tight-binding (Haydock et al., 1972, 1975; Haydock, 1980). In the recursion 
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method, a continued fraction representation of the projected density of states is developed and quantities of 
interest are obtained by integration over it. However, we proceed differently and directly calculate the Gauss 
quadrature rules, which is a much more accurate and stable process. 

The remainder of this paper is organized as follows. In Section 2, we provide a brief introduction to 
DFT and in particular to the Kohn-Sham method. This includes a heuristic overview of our reformulation 
in Section 2.2 . Next, we provide some mathematical background to spectral theory and Gauss quadrature 
in Section 3. Subsequently, we present and validate the LSSGQ formulation in Section 4. In Section 5, we 
describe CGDFT, and validate it through select examples. Finally, we conclude in Section 6. 



2. Kohn-Sham Density Functional Theory 

2.1. Orbital formulation 

Consider a system of M atoms with N e electrons. DFT is a theory which can be used to find the ground- 
state energy of the system, the ground-state electron density and the equilibrium position of the nuclei. In 
our presentation, we ignore electron spin and assume that N e is even for clarity, but note that all the methods 
developed in the paper may easily be extended to include spin. Let R = {Ri, R2, • • • , Rju} denote the 
positions of the nuclei with charges {Zi, Z2, ■ ■ ■ , Zm} respectively. We follow Kohn and Sham (1965) and 
introduce orbitals ^ = {tpi, tp2, ■ ■ ■, ipN e /2 } f° r tne electrons. Then, the energy of the system is written as 



8 (tf , R) = - V / ^(x)V 2 i(x) dx + E xc (p) + E u (p) + E cxt (p, R) + E ZZ (R) 



(1) 



where the electron density 

iV e /2 

p(x)=2^ |Vn(x)| 2 . (2) 

n=l 

The first term in Eqn. 1 represents the kinetic energy of the non-interacting electrons. The second term, 
E xc (p), denotes the exchange-correlation energy. For definiteness, we adopt the local density approximation 
(LDA)(Kohn and Sham, 1965) 

E xc = / ptxc(p) dx (3) 



where e xc {p) is the exchange-correlation energy density. The final three terms are electrostatic in nature, 

MP) = \ l I #^dxdx', (4) 
£ cxt (p,R) = / p(x) IV , ~' I dx, (5) 




z 1=1 j=i |±l7 

En is known as the Hartree energy and is the classical electrostatic interaction energy of the electron density, 
E cxt is the electrostatic interaction energy between the electron density and nuclear charges, and E zz is the 
repulsive energy between the nuclei. 
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For a given position of the nuclei R, the electronic ground-state energy of the system is obtained as the 
solution to the variational principle. 

£q(R.) = inf £(^>, R) (7) 

subject to the orthonormality constraints 

( V?(x)^-(x)dx = c%, i,j = 1,2,..., N e /2. (8) 
Jm.3 

It is convenient to rewrite the electrostatic terms by introducing the electrostatic potential <ft as the solu- 
tion to the Poisson equation (cf e.g. Ismail-Beigi and Arias (2000); Suryanarayana et al. (2010)) 

- i- V 2 <£(x, R) = p(x) + 6(x, R) (9) 

where 6(x, R) = Yl j=i kj( x , R-j) denotes the total charge density of the nuclei, with 6j(x, Rj) represent- 
ing the regularized charge density of the J th nucleus. Thereafter, the electrostatic energies may be rewritten 
as 

E R + E cxt + E zz = / |V0(x,R)| 2 dx+ / (p(x)+6(x,R))0(x,R)dx (10) 

Ay f f fcj(x,Rj)bj(x / ,R, / ) dxdx/ 

2 Jul 3 Jul* l x - x 'l 



j=i 

where the last term denotes the self energy of the nuclei. 

The Euler-Lagrange equations of the constrained variational principle (Eqns. 7 and 8) gives rise to the 
nonlinear eigenvalue problem 

U^n = \ n ^n, U = 1, 2, . . . N e /2 (11) 

where the Hamiltonian 

n = -^V 2 + V xc (p) + ^,-R) (12) 

is a self-adjoint operator with eigenvalues A n (ordered so that Ai < A2 < • • • ) and V xc (p) = S ^p^ ■ 
The problem is nonlinear since the Hamiltonian depends on p (through V xc (p) and 0(x, R)) which in turn 
depends on ip n . Therefore, this problem is typically solved by a fixed point iteration with respect to the 
electron density, known as the self-consistent field (SCF) method (cf. e.g. Martin (2004)). In each iteration 
of the SCF method, the electron density is calculated by solving for the eigenfunctions i/; n corresponding to 
the lowest N e /2 eigenvalues A n , and then using Eqn. 2. This is indeed equivalent to the variational problem 
(cf. e.g. Suryanarayana et al. (2010)). 

The electronic ground-state energy is obtained by substituting the solution of Eqn. 11 in Eqn. 1. We 
can then use the eigenvalue problem given by Eqn. 1 1 (left multiply by tjj* and integrate) to show that this 
ground-state energy is 

N e /2 

£ (R) = 2 V X n + E xc (p) + - / (6(x, R) - p(x))(£(x, R) dx 

- / VMM I bj(X ' Rj)bj( f Rj) dxdx' (13) 

JR3 i^^sJus |x-x'| 
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where Uband = 2 Yln=i denotes the band structure energy. 

In this orbital approach, we solve the nonlinear eigenvalue problem (Eqn. 1 1) for the first N e /2 eigenval- 
ues and the corresponding eigenf unctions. Subsequently, the electronic ground-state energy can be evaluated 
using Eqn. 13. This formulation of DFT requires an effort scaling as O(N^) (Goedecker (1999)), which 
restricts the size of systems that can be studied. Further, the orbitals are not amenable to coarse-graining 
since they are extremely oscillatory and are not periodic even in periodic systems. Finally, a quasicontinuum 
representation of each orbital would scale poorly since the number of orbitals increases with the number of 
electrons. 

2.2. Heuristics of the proposed reformulation 

The key idea behind our reformulation of DFT is to note that the ground-state energy (Eqn. 13) does not 
depend on individual eigenvalues and eigenf unctions, but only on certain sums: 

N e /2 iV e /2 

U band = 2 J2 A n , p(x) = 2 lV^n(x)| 2 . (14) 

n=l n=l 

So it seems natural to rewrite these sums by sampling and weighting: 

K K 
Uband = 2^w fc A fc , p(x) = 2^2w k \ip k (x.)\ 2 (15) 

k=l k=l 

for appropriately chosen sampling 'eigenvalues' X k , 'eigenf unctions' ^ and weights w k . We seek to find a 
way of choosing the sampling points optimally and computing the weights efficiently. Particularly, we would 
like K independent of N e and K <C N e /2 (for large iV e ) such that the overall effort in evaluating these sums 
is 0{N e ). Since the number of electrons is proportional to the number of atoms, the computational effort 
will scale as 0(M). 

We accomplish this in four steps. We outline these here, and develop them precisely in Sections 4 after 
recalling some necessary mathematical preliminaries in Section 3. First, we introduce the orbital occupation 
function 

, , fl, if A < X f 
5(A,A/)= ' (16) 
10, otherwise 

and rewrite Ub an d an d p(x) as 

Uband = X n g(Xn,X f ), p(x) = 2 g{X n , X f ) | Vn(x) | 2 . (17) 

n n 

Above Xf = Ajv e / 2 i s ca U e d the Fermi energy. This is unknown a priori but can be solved through the 
constraint 

N e = 2j29iK,X f ). (18) 

n 

Second, we rewrite the sums in Eqns. 17 and 18 as spectral integrals of the form 



/[/] = / /(A)d<7(A) (19) 

Ja(H) 
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where a{H) is the spectrum of the Hamiltonian % and dcr(A) is the appropriate spectral measure. We 
provide an introduction to spectral theory in Section 3.1 and develop integral representations in Section 4.1. 
Third, we evaluate these spectral integrals using Gauss quadratures 

K 

*[/] « Z>k/(A*)- (20) 

k=i 

Finally, we evaluate the quadrature nodes {\k}k=i and weights {wk\k=i using an efficient Lanczos type 
iteration. We provide an introduction to Gauss quadratures and the Lanczos type iteration in Section 3.2 and 
a precise algorithm in Section 4.2. We discuss the scaling and performance of the method in Section 4.3. 

The reformulation as described above overcomes the difficulties encountered when attempting to coarse- 
grain the orbital formulation. Notably, the spectral quadrature points and weights can be formulated as local 
real space variables (at each spatial discretization point), and these are smoothly varying quantities. This 
enables us to coarse-grain the problem in Section 5. 

2.3. Atomic positions 

To calculate the ground-state energy of the system, we need to further minimize the energy with respect 
to the positions of the nuclei. To do so, we take the first variation of £o(R) with respect to R to obtain the 
forces on the nuclei 

fj = ~ [ d^S^ (0(x, R) - 0j(x, Rj)) dx (21) 

J K 3 Oft J 

where 0j(x, Rj) = J R3 fej |^ c _'^j J ' > dx' and fj represents the force on the J th nucleus. In the special case 
where we have bj(x, Rj) = — Zj5(x. — R j), we obtain 



/ J = Z J V(^(x,R)-^ J (x,R J )) 
This result is commonly referred to as the Hellmann-Feynman theorem (cf. e.g. Finnis (2003)). 



(22) 

x=Rj 



2.4. Pseudopotential Approximation 

The core states are localized in the vicinity of the nucleus leading to oscillations of the valence orbitals 
in this region. Irrespective of the basis set used, a large number of basis functions are required to capture 
these oscillations. Additionally, the tightly bound core electrons are chemically inactive and hence have a 
negligible contribution towards determining physical properties. In view of this, the core electrons are elim- 
inated and an effective nuclear potential is introduced to describe the effect of the core electrons, resulting in 
nodeless pseudo-orbitals. This amounts to replacing the all-electron potential 14 xt (x, R) = 2~2iLi | X Z R/| 
with an effective potential (x, R). 

The pseudopotentials can be broadly classified as either local or non-local based on their spatial de- 
pendence. The local pseudopotentials are explicit functions V^f (x, R). They can be incoporated into our 
formulation by letting 6(x, R) = f^V 2 V^ t s . In contrast, non-local pseudopotentials are operators on the 
orbital with angular momentum dependence and are designed to accurately reproduce the scattering proper- 
ties of the all-electron potential. These include norm conserving (Bachelet et al., 1982; Rappe et al., 1990; 
Troullier and Martins, 1991) and ultrasoft pseudopotentials (Vanderbilt, 1990), which are usually employed 
in the Kleinman-Bylander form (Kleinman and Bylander, 1982). For the examples is this paper, we utilize 
a local pseudopotential termed as the 'Evanescent Core' pseudopotential (Fiolhais et al., 1995). However, 
it should be noted that the methods proposed in this work are applicable even with the choice of non-local 
pseduopotentials. 
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2.5. Finite temperature extension 

In previous discussion of the Kohn-Sham method, we have tacitly assumed a temperature of absolute 
zero. The method can be easily extended to finite temperatures (cf. e.g. Pan - and Yang (1989)). The Kohn- 
Sham problem for the electronic ground-state takes the form 

m n = \ n ^ n , ^ = -^V 2 + K C (p) + ^(x,R), 

-— V 2 <Kx,R) = p(x)+fe(x,R), 

p(x) = 2^ 9 (A n ,A / )|^n(x)| 2 , (23) 

n 

N e = 2^<7(A„,A/) 

n 

where the orbital occupation is now allowed to take fractional values as defined by the Fermi-Dirac distri- 
bution 

g(M/)= \ A , A/ - (24) 

1 + exp(^^) 

Above, a = ksd, kg being the Boltzmann constant and 9 is the absolute temperature. The ground-state 
Helmholtz free energy 

T = £ a - OS (25) 

where 

£ a = 2Y^g(X n ,X f )X n + E xc (p) + l [ (6(x, R) - p(x))</>(x, R) dx 

-f ^(pwx)dx-if;/ / b J (x.R J )b j( x-.R J ) dxdx , 

2 ^ J R 3 J R 3 |X - X 

is the finite temperature counterpart of the ground-state energy at absolute zero (£q) and 

5 = -2k B b(An, A/) log 5 (A„, X f ) + (1 - g(X n , A,)) log(l - g(X n ,Xf))} (26) 

is the entropy resulting from the fractional orbital occupations. The ground-state position of the nuclei can 
be found be equilibriating Eqn. 21 (Weinert and Davenport, 1992). Using the finite temperature calculation, 
it is also possible to to get an accurate extrapolation for the ground-state energy at absolute zero (Gillan, 
1989) 

So « I (£* + T) ■ (27) 



3. Mathematical Background 

In this section, we recall a few mathematical tools which we exploit to develop our methods. We refer 
the reader to Rudin (1991) for further details on spectral theory and Golub and Meurant (2010) for Gauss 
quadratures. 
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3.1. Spectral theory 

Let H be a self-adjoint linear operator on a finite-dimensional Hilbert space H with inner product (., .) 
and norm ||.||. Then, according to the spectral theorem, we have a unique resolution of the identity £ that 
satisfies 



H= / Ad£(A) (28) 

Ja(H) 

where a(H) = {X±, A2, • • • , Xjy d } C M is the spectrum of %. Consequently, for any bounded function / on 

<?{%), 



M)= / /(A)df(A). (29) 
J<t(h) 

Note that Eqn. 29 can also be written using Dirac's bra-ket notation as /(%) = En=i /(^n)|V , n)(V'n|> 
where denotes the eigenf unction corresponding to the eigenvalue A n . We assume that the eigenvalues 
are ordered such that Ai < A2 < • • • < Xjv d . 

Let {f]p}p=i be an orthonormal basis of H so that for any ( e H,( = E1S1 (p 7 1p- The eigenfunctions 
can therefore be expressed as ip n = J2p=i ipn&Vp- Thereafter, we have the following representation of the 
measure (A) on a(H) 



£ C)C (A) = (£(A)C,C) 
In the special case of £ = rjk, it reduces to 



0, ifA<A 



En=i Ep=i E^i Mn, q C P ( q , if A m < A < A m+ i • (30) 

I ESl E^l E5l Mn.qtpC* if AjV d < A 



0, if A < Ai 

£%,?? fc ( A ) = "( En=l iV'n.fcl 2 , if A m < A < A m+ l (31) 

.En=!l^| 2 , ifAiv d <A. 



Therefore for any ( E H, 

f(X)d£ cc (X)= [" 

r(W) 

for a = Ai and 6 = A7v d . It is worth noting that d£^^(X) represents the projected density of states of %. 



,0= f /(A) d£ (X (X) = f /(A) d£ u {\) , (32) 



3.2. Gauss quadrature 

Let V be the space of real polynomials and Vk be a subspace of V consisting of polynomials of degree 
K. We define an inner product (relative to the measure S^) of two polynomials p, q € V as 

(p,?>C = [ b p(X)q(X)d£ u (X) (33) 

J a 

with norm 

1 

' p 2 (A)d£ c , c (A)V . (34) 
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Here, we are interested in approximating integrals of the form (Eqn. 32) 



I\f] 



/(A)d£ c , c (A) 



(35) 



using a quadrature rule. To do so, we approximate the function /(A) with an interpolation polynomial 

/(A)-EM)4(A) 



(36) 



fc=i 



where {A^}^ 1 are the nodes/interpolation points and l k (X) is the Lagrange polynomial 



n 



A- A 



\ C _ \ C ' 



(37) 



Thereafter, we obtain the quadrature formula 
where 



A* 



k=l 



W 



C 



z£(A)d5 CiC (A) 



(38) 



(39) 



are the weights of the quadrature rule. In order to obtain the quadrature rule of highest degree, we further 
consider the nodes {X k }£ =l as unknowns. This results in a quadrature rule of degree 2K — 1 i.e. any 
V £ V-iK-i is integrated exactly, commonly referred to as Gauss quadrature. 

To efficiently evaluate the nodes and weights of the Gauss quadrature rule, we generate a sequence 
of orthonormal polynomials (with respect to the measure £^) {p k } k=0 through the three-term recurrence 
relationship 



b k+1 p k+1 (X) = (A - a k+ i)p k {X) - fcfcPfc-i(A) , k = 0, 1, . . . , K - 1 

p_i(A) = 0, po(A) = 1, 6o = l 



(40) 



where 



a k+1 = (Aj5 fc ,p fc ) c , k = 0,1,... ,K - 1 (41) 

and b k is computed such that \\p k \\( = 1 , k = 0, 1, . . . , K. Corresponding to these orthonormal polynomi- 
als, there is a tridiagonal Jacobi matrix Jk of dimension K 

( ai b\ 



Jk 



b\ a 2 



\ 



\ 



bic-2 a,K-i bic-i 

bK-i a K j 



(42) 



Let us denote Pk(X) = (po(A),pi(A), . . . ,pk-i(X)) t . Then the three term recurrence relation given by 
Eqn. 40 can be written compactly as 



XP K (X) = J k Pk{X) + b K p K (X)e K 



(43) 
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where ex is the last column of the identity matrix of dimension K. It follows that the eigenvalues of Jk 
(which are also the zeros of Pk(A)) are the nodes {A/ C }^L 1 of the Gauss quadrature rule and the weights 



4. Linear-Scaling Spectral Gauss Quadrature Method 

In this section, we describe the formulation, implementation and validation of the proposed linear- 
scaling spectral Gauss quadrature (LSSGQ) method. The outline of this section is as follows. We first 
develop integral representations for the quantities of interest in Section 4.1. Subsequently, we describe the 
LSSGQ method in Section 4.2 and discuss its scaling and performance in Section 4.3. Next, we discuss the 
LSSGQ method in the context of the finite-difference approximation in Section 4.4. Finally, we validate the 
LSSGQ method through examples in Section 4.5. The connections of the LSSGQ method with the recursion 
method (Haydock (1980)) and the Pade approximation are discussed in Appendix A. 

4.1. Integral representations 

Consider a spatial discretization of our domain with basis functions. Let H be the (linearized) 
Hamiltonian (Eqn. 12) with a fixed p and <fi. In order to solve the Kohn-Sham problem, we need to evaluate 
the Fermi energy, updated electron density, band structure energy and entropy. We begin by providing an 
integral representation of these quantities over the spectrum of H. 

4.1.1. Fermi energy 

The Fermi energy is calculated by solving for the constraint 



{wk} 



K 

k=l 



are the squares of the first elements of the normalized eigenvectors. 




(44) 



n=l 



Since \\ip. 



it 



Ep=i\^n, P \ 2 = 1, it follows that 



^#(A n ,A/) 



N d N d 



n,p 



2 



n=l 



p=l n=l 




(45) 



Therefore, Eqn. 44 can be rewritten as 




(46) 



11 



4.1.2. Electron density 

The electron density at any point xo is 

N d 

p(x ) = 2^#(A n , A/)|^ n (x )j 2 

n=l 

N d N d N d 

= 2 ^ Yl Yl #( An ' A /)V ; n,p^n, (/ 7?p(xo)?? g (xo) 
n=l p=l q=l 

= 2 [ b g(\,\ f )d£ u (\). (47) 

J a 

where ( = E^i V P {xo)v P - 

4.1.3. Band structure energy 

The band structure energy may be expressed as 

N d 

Uband = 2 A„#(A n , A/) 
n=l 
^d ^d 

= 2^^ A n 5(A„,A / )|V'„,p| 2 

p=l n=l 
N d ,b 

= 2J2 / A^A/Jd^. (48) 
P =i • 7a 

Entropy 
Finally, the entropy follows as 

N d 

S = -2A: B ^[ 5 (A„,A / )log 5 (A ri ,A / ) + (l- 5 (A n ,A / ))log(l- 9 (A n ,A / ))] 

n=l 

^d ^d 

= -2fc B Y E^ A «- A /) log <7(A„, A/) + (1 - 5 (A re , A/)) log(l - 5 (A„, A/))]|-0n, P | 2 

p=l n=l 
^d ,6 

= -2k B Y b(A,A / )log 5 (A,A / ) + (l- 5 (A,A / ))log(l- 9 (A,A / ))]d^ pflp . (49) 
P =i- /a 

4.2. Formulation 

We proceed to describe the proposed LSSGQ method. We solve the nonlinear Kohn-Sham eigenvalue 
problem using the SCF method (cf., e. g., Martin (2004)). Customarily, in each iteration of the SCF method, 
the eigenvalues and eigenfunctions of the Hamiltonian (H) are evaluated for a given electron density. The 
updated electron density is evaluated using Eqn. 2, and the process is repeated until convergence. However, 
we circumvent the evaluation of the eigenfunctions and directly evaluate the electron density by employing 
the procedure described below. 
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We seek to evaluate the integral representations of the Fermi energy and electron density described 
earlier using Guass quadratures. However, only the operator H is known and its resolution of the identity 
£(A) is unknown a priori. In order to overcome this difficulty, we use the spectral theorem to rewrite the 
recurrence relation given by Eqn. (40) as the Lanczos-type iteration 

bk+iVk+i = (U - a k+1 )v k - bkVk-i , k = 0, 1, . . . , K - 1 

w_i = 0, v = (, b = l (50) 

where 

a k+1 = (Uv k ,v k ) , k = 0,1, ...,K-1 (51) 

and b k is computed such that \\v k \\ = 1, k = 0,1, . . . ,K— 1. Note that the initial condition vo = £ is chosen 
depending on the measure Sq ^(A) with respect to which integration needs to be performed. Thereafter, the 
nodes and corresponding weights of the quadrature rule are ascertained following the procedure described 
in Section 3.2. Therefore, we arrive at 

N d fb N d K 

N e = 2^T / g(\,\f)d£ Vp , Tlp (\)n2j2Y, w k9(>%>*f) (52) 

p=l Ja p=l k=l 

r b K 

p(x ) = 2 g(X,X f )d£ cx (X)^2j249^i X f) ^ 
Ja k=i 

where C = E^i V P (^o)v P - 

Once the self-consistent solution of the Kohn-Sham problem for a given position of the nuclei is cal- 
culated, the free energy (Eqn. 25) is calculated by using Gauss quadrature for the evaluation of the band 
structure energy and entropy 



N d ,. b N d K 



band 



2J2[ A 5 (A,A / )d£ 7?p ,, p (A)«2X;X;^A^(A^,A / ), (54) 
p=i Ja p=i k=\ 

N d r b 

S = -2k B Y, [ g (A,A / )log 5 (A,A / ) + (l- 5 (A,A / ))log(l- 5 (A,A / ))]d^ pj7?p (A) 
p=i Ja 

N d K 

p=i fc=i 

To calculate the ground-state free energy, we need to further minimize the free energy with respect to the 
positions of the nuclei. To this end, we equilibrate the forces on the nuclei given by Eqn. 21. We summarize 
the LSSGQ method in Algorithm 1. 



4.3. Scaling and Performance 

In the LSSGQ method, the Fermi energy, electron density, band structure energy and entropy are evalu- 
ated by performing Gauss quadrature on their respective integral representations. The cost of evaluating each 
integral is determined by the calculation of the matrix Jk (Eqn. 42) via the recurrence relation (Eqn. 50) 
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Algorithm 1: LSSGQ method 



Generate guess for positions of the nuclei (R) 
repeat Relaxation of atoms 

Calculate charge density of the nuclei (6) 
Generate guess for electron density (p) 
repeat Self-Consistent loop: SCF 

Calculate electrostatic potential (0) by solving Eqn. 9 
Calculate exchange correlation potential (V xc ) 
Calculate spectral Gauss quadrature nodes and weights 
Calculate Fermi energy (A/) using Eqn. 52 
Calculate electron density (p) using Eqn. 53 
Update the electron density (mixing) 
until Convergence of self-consistent iteration; 
Calculate the forces on the nuclei using Eqn. 21 
until Energy minimized with respect to positions of atoms; 

Evaluate free energy F using Eqns. 54 and 55 for the band structure energy and entropy respectively. 



and the evaluation of its eigenvalues and eigenvectors. The size of Jk in turn is ascertained by the number 
of nodes used for the quadrature (K). We claim that the number of quadrature nodes required for achiev- 
ing a predifined accuracy is independent of system size based on the following argument in the spirit of 
Goedecker (1999). For definiteness, consider the evaluation of the electron density where the function to be 
integrated is g(X, \f). Since there are K quadrature nodes, the resolution offered is roughly proportional to 
1/K . Therefore, we can roughly estimate 

^N d — Ai 

AA ' (56) 

where AA is the width of the region where g(\, Xf) takes fractional values. The ratio in Eqn. 56 is indepen- 
dent of system size, since the number of basis functions required per atom is usually independent of system 
size. Further, if the orthonormal basis used to discretize H is localized, then the matrix representation of 7-L 
is sparse. In such a situation, the evaluation of the quadrature nodes and weights is independent of system 
size. Since we have to evaluate O(Nd) such quadrature rules, and since scales as number of atoms 
M, we conclude that the entire method scales as O(M). This is indeed verified by the results presented in 
Section 4.5. 

To understand the performance of the LSSGQ method, consider any integral of the form given by 
Eqn. 35. The error incurred by using Gauss quadrature is given by 

m = Mc^r (57) 

for some c € [a, b]. It is to be expected that the accuracy of the LSSGQ method improves with increasing 
the temperature 9. Further, the LSSGQ method is also expected to have spectral convergence with respect 
to the number of quadrature points (K). These properties are verified through the test cases presented in 
Section 4.5. 

In this work, we have developed quadrature rules of degree 2K — 1 by treating both the nodes and 
weights as unknowns, thus performing Gauss quadrature. It is also possible to fix the positions of the nodes 
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and treat only the weights as unknowns. However, this will result in a quadrature rule of degree K — 1. 
Therefore, our choice of Gauss quadrature is optimal in the sense that for a given number of nodes, it has 
the highest degree quadrature rule. Also, in the proposed approach, the Fermi energy is not needed as an 
input, which highlights the efficiency of the method. 

4.4. Finite-difference approximation 

The method developed above is general and can be implemented with any localized orthogonal basis. We 
now briefly describe the implementation with the finite-difference approximation. This provides a concrete 
example of the method, and also prepares for the examples that follow. Consider a domain in dimension D, 
and discretize it with a uniform finite-difference grid with nodes {xP} ^ and spacing h. By discretizing 
the Laplacian in the Hamiltonian using finite-differences of chosen order, we obtain the matrix eigenvalue 
problem 

H</>„ = A n <0„ , n = l,2,...,N d (58) 

where ip n = [^ n ,i^n,2i ■ ■ ■ ■> ipn,N d ] T is the n th eigenvector or orbital and i/] njP is the value of the orbital at 
note p. The Lanczos iteration (Eqn. 50) takes the form 



+1 



V-l 



(H 

M 
o 



afc+i)v fc 



v 



i 



1, k = 0,l,...,K -1 



(59) 



where 



«fc+i = v fc Hv fc , k = 0, 1, . . . ,K - 1 



(60) 



and bf. is computed such that ||vfe| 



1. For the p finite-difference node, only the p component of the 
initial vector vo is nonzero and is set to unity. The calculation of the spectral quadrature nodes {\ p k }^ =1 and 
weights {w^}£ =1 for each node p G [1, N^] proceeds as described in Section 4.2. Importantly, note that 
these values may be computed independently at each node. Once the spectral quadrature nodes and weights 
are known, we evaluate the Fermi energy by solving for the constraint 



A'rf 



N P = h 



D 



(61) 



where 



K 



(62) 



k=l 



is the electron density at the p th finite-difference node. Furthermore, the band structure energy and entropy 
can be evaluated 



N d 



Uband 



p=l 

N d 



(63) 



(64) 
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where 



K 



* = a^X^C^A,), (65) 

k=l 

K 

s p = -2k B ^wl[ 9 (Xl,Xf)logg(X p k ,Xf) + (l-9(K,Xf))log(l-9(X p k ,Xf))}- (66) 

k=l 

The above equations allow us to intepret u p and s p as the nodal band structure energy and nodal entropy. 

4.5. Numerical Examples and Validation 

In this section, we proceed to verify the LSSGQ method through selected examples. Our first example 
concerns a one-dimensional model problem proposed by Garcia-Cervera et al. (2009). Since the problem 
shares many of the features of the linearized Kohn-Sham problem, it provides a convenient test of the 
accuracy and performance of the LSSGQ method. In our second example, we apply the LSSGQ method to 
the nonlinear three-dimensional Kohn-Sham problem and evaluate selected crystal properties. Finally, we 
showcase the scope and versatility of the LSSGQ method by studying the phenomenon of surface relaxation. 

4.5.1. One-dimensional model problem 

Consider a one-dimensional chain of atoms positioned with unit spacing at {Rj}^ =v Let the effective 
potential due to an atom at Rj be given by (Garci'a-Cervera et al., 2009) 

and therefore the total potential at any point is given by V(x) = Ylj=i Vj(x). Here, a and (3 represent the 
depth and width of the wells respectively. The ground-state properties are determined through the eigenvalue 
problem 

Uil>n = X n ip n , n = l,2... (68) 

where the Hamiltonian is given by 

The Fermi energy, electron density and energy are calculated using the relations 

N d 

N e = J^An.A/), (70) 

n=l 

N d 

p(x) = ^g(A n , Xf)\ifj n (x)\ 2 , (71) 

71=1 

N d 

£ a = ^2g(X n ,X f )X n . (72) 

n=l 

We introduce a 12 th order accurate finite-difference approximation for the Laplacian and use the LSSGQ 
method described in Section 4.4. By appropriately choosing the parameters a and fj, we can vary the band 
gap of the system, thereby choosing between insulating, semiconducting and metallic systems. Further 



16 



details can be found in Garcfa-Cervera et al. (2009). We consider two sets of parameters (a, (3) = (100, 0.3) 
and (a, (3) = (10, 0.45), which we designate as insulator and metal, respectively. We also study the effect of 
temperature on the efficacy of the method, by considering two extreme cases corresponding to a = 0.0001 
and a = 1.0. In the results presented below, the error in energy is defined as the difference between the 
energies obtained by the LSSGQ method and diagonalization. Similarly, error in electron density is the I? 
norm of the difference in electron densities obtained on using the LSSGQ method and diagonalization. We 
note that all errors have been normalized by the corresponding quantities obtained via diagonalization. 

~N on-periodic calculations. For these calculations, we choose a chain of 101 atoms, i.e. M = 101. The 
convergence of the LSSGQ method with respect to the number of spectral quadrature points for both metals 
and insulators is presented in Fig. 1 . The rapid convergence of the energy for both metals and insulators is 
clear from the figure. For insulators, the convergence is independent of temperature, a consequence of the 
presence of a band gap. By contrast, convergence for metals is significantly accelerated on increasing the 
temperature. 



0.5 




Number of spectral quadrature points (K) 

(a) £7 = 1 



Figure 1: Convergence of the LSSGQ 
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Number of spectral quadrature points (K) 

(b) a = 0.0001 
for the one-dimensional model problem 



Next, we introduce a defect by removing the center atom and evaluate the defect energy as well as 
the defect electron density. The defect energy and defect electron density are defined as the difference in 
energies and electron densities between systems with and without the defect. Accurately predicting these 
quantities is challenging, as it requires the calculation of relatively small differences. The results so obtained 
are presented in Fig. 2. For the insulator, there is rapid convergence with number of spectral quadrature 
points and the convergence is independent of the temperature. However, for the metal we notice that the 
number of spectral quadrature points required is dependent on the temperature, with fewer points required 
on increasing the temperature. 

Periodic calculations. Consider next an infinite chain of atoms with unit spacing. We define a two atom unit 
cell as the representative volume Qrv- Since the potential decays exponentially, we evaluate the potential 
inside Qrv by considering the contribution of atoms which are within a cutoff radius. We then follow the 
procedure outlined in Appendix B to evaluate the Fermi energy, electron density and energy per atom. The 
results so obtained are presented in Fig. 3. As observed in the non-periodic calculations, convergence in 
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Figure 2: 
problem 



Convergence of the defect energy and defect electron density for the one-dimensional model 



insulators is independent of temperature, whereas for metals convergence is significantly accelerated with 
increasing temperature. 

Performance and Scaling. We now examine the rate of convergence as well as the scaling properties of the 
LSSGQ method. First, we perform non-periodic calculations on a chain of 101 atoms (M = 101). We plot 
the convergence in energy with respect to the number of spectral quadrature points in Fig. 4. It is clear that 
we get spectral convergence, as is to be expected, since Gauss quadrature is itself a spectrally convergent 
method. 

Next, we study the scaling of the LSSGQ method in terms of the computational time taken versus the 
system size. We perform non-periodic calculations for metallic systems ranging from 1 atom to 100, 000 
atoms, all on a single computer. This has been showcased in Fig. 5, which highlights that the LSSGQ method 
is indeed linear-scaling. It should be noted that such large systems are intractable for orbital formulations, 
especially when solving on a single computer. 
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Figure 3: Convergence of the LSSGQ method for the periodic one-dimensional model problem 
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Figure 4: Spectral convergence of the LSSGQ method 



4.5.2. Kohn-Sham problem 

In the previous section, we have verified the LSSGQ method by means of a one-dimensional chain of 
atoms interacting through a model potential. In our next example, we apply the LSSGQ method to the 
solution of the Kohn-Sham problem using the LDA (Kohn and Sham (1965)) for the exchange correlation 
energy, and the 'Evanescent Core' pseudopotential approximation (Fiolhais et al., 1995). Specifically, we 
evaluate the bulk properties of body centered cubic (BCC) crystals of sodium, lithium and study the phe- 
nomenon of (001) surface relaxation of BCC sodium. 

In our calculations, we utilize the procedure outlined in Algorithm 1 to solve for the ground-state proper- 
ties. Specifically, we use a 6 th order accurate finite-difference discretization of the Laplacian. We solve for 
the equilibrium atomic positions through the BFGS quasi-Newton method with a cubic line search procedure 
(cf., e. g., Vogel (2002)). For the SCF method, we use the generalized Broyden method (Fang and Saad, 
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Number of atoms 



Figure 5: Linear-scaling nature of the LSSGQ method 

2009) for convergence acceleration. We handle the periodicity in the system for the eigenvalue problem 
through the procedure outlined in Appendix B. We calculate the Fermi energy by using a combination of bi- 
section, secant and inverse quadratic interpolation methods (Forsythe et al., 1973) to solve Eqn. 52. We solve 
the Poisson equation (Eqn. 9) with the generalized minimal residual method (GMRES - Saad and Schultz 
(1986)), wherein we evaluate the charge density 6(x, R) by employing a prespecified radius. In order to 
evaluate the free energy of the system, we need to perform spatial integrations. To this end, we employ the 
trapezoidal rule, wherein each quantity is assumed to be constant in a cube of side h around each finite- 
difference point. Here, h denotes the spacing of the finite-difference nodes. Finally, we choose a = 0.8 eV 
for all our calculations and extrapolate to find the ground-state energy at absolute zero using Eqn. 27. 

Crystal properties. We begin by evaluating the bulk properties of BCC sodium and lithium. We designate 
the BCC unit cell as the representative volume and use periodic boundary conditions. First, we verify the 
convergence of the LSSGQ method. To do so, we plot the binding energy per atom as a function of the 
number of spectral quadrature points (K) in Fig. 6. It is clear that we obtain rapid convergence, highlighting 
the efficacy of the LSSGQ method. Next, we plot the binding energy per atom as a function of lattice 
constant in Fig. 7. Using a cubic fit to this data, we calculate the cohesive energy, equilibrium lattice 
constant and bulk modulus. The results so obtained are presented in Tables 1 and 2 with a comparison to 
previous results obtained by Fiolhais et al. (1995). There is good agreement in the cohesive energy and the 
lattice constant. However, there is a reasonable disagreement in the bulk modulus. The fact that we obtain 
a smooth curve for the binding energy versus the lattice constant (Fig. 7) gives us confidence in our results. 
Nevertheless, the discrepancy warrants further investigation. 

Surface relaxation. Next we study the (001) surface relaxation of BCC sodium. We choose the represen- 
tative volume (Qrv) to be a rectangular cuboid of dimensions a x a x h, where 'a' is the lattice constant. 
Furthermore, h = h vac + h ce u, where h vac = n\a and h ce u = ri2a (n\, r&2 E Z) are the heights of vacuum 
and material included in Qrv, respectively. We use periodic boundary conditions in the x±,X2 directions. 
In addition, we use zero Dirichlet boundary conditions on the top face of ft^y and assume that the perfectly 
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Figure 6: Convergence of the LSSGQ method for evaluating the crystal properties 
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Figure 7: Binding energy as a function of lattice constant 



Table 1 : Crystalline properties of BCC sodium 



Property 


LSSGQ 


Fiolhais et al. (1995) 


Experiment (Fiolhais et al., 1995) 


Cohesive energy (eV/atom) 


-1.03 


-1.02 


-1.04 


Lattice constant (a. u.) 


8.01 


8.21 


8.21 


Bulk modulus (GPa) 


5.0 


7.1 


7.3 



periodic solution is attained in the bottommost unit cell of £Irv- We relax the atoms only along the X3 
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Table 2: Crystalline properties of BCC lithium 



Property 


LSSGQ 


Fiolhais et al. (1995) 


Experiment (Fiolhais et al., 1995) 


Cohesive energy (eV/atom) 


-1.21 


-1.31 


-1.0 


Lattice constant (a. u.) 


6.87 


6.77 


6.77 


Bulk modulus (GPa) 


10.0 


14.0 


13.3 



direction. We evaluate the surface energy using the relation 

'-'surface o v'-'/ 

1 a 2 

where £q rv is the total energy of finy £coh is the cohesive energy for a perfect crystal and N e is the number 
of electrons in Qrv- 

We begin by verifying the convergence of the calculations with respect to the number of spectral quadra- 
ture points (K) in Fig. 8. The rapid convergence, highlighting the efficacy of the method, is noteworthy from 
the figure. The calculated surface energy and the displacement of the atoms are collected in Table 3. The 
calculated surface energy is in good agreement with previous values in literature (Vitos et al., 1998). Note 
that the forces on the atoms in the third layer and beyond are below the threshold value used for the cal- 
culations. Also, the convergence of the calculated quantities with respect to size of Qrv has been verified. 
Finally, the electron density contours on the mid plane and edge plane are plotted in Fig. 9 for purposes of 
illustration. 




Number of spectral quadrature points (K) 
Figure 8: Convergence of LSSGQ method for the (001) surface relaxation of BCC sodium 



5. Coarse-Graining Density Functional Theory 

Defects in crystalline solids involve elastic and electrostatic fields that may decay extremely slowly. 
Therefore, accurate computations of such problems require very large computational domains, and even 
linear-scaling methods can be insufficient to fully understand them. However, the slow decay offers an 
opportunity to approximate the problem in a manner that reduces the computational complexity with limited 
loss of accuracy. In this section, we develop a method wherein we solve defects (using DFT) without the 
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Table 3: Properties of the (001) surface of BCC sodium 



Surface energy 0.2 J/m' 2 
Displacement of atoms 

First layer 0.75 a. u. 

Second layer -0.19 a. u. 




(a) Mid plane (b) Edge plane 

Figure 9: Electron density contours on slices through a (001) surface of BCC sodium 

introduction of additional equations or uncontrolled approximations, but with substantially reduced effort. 
This method is in the spirit of the quasicontinuum approximation used in atomistic models (Tadmor et al. 
(1996)) and OFDFT (Gavini et al. (2007)), but also has important differences. For definiteness, we discuss 
the method in terms of the finite-difference approximation scheme. We formulate the coarse-grained DFT 
(CGDFT) approach in Section 5.1 and validate it through examples in Section 5.2. 

5.7. Formulation 

Consider the implementation of the LSSGQ method within the context of finite-differences (Section 
4.4). Let us denote the collection of all finite-difference nodes {x p } ^ by M. In the finite-difference 
approximation, iV e , Ub an d and S can be represented as the sums of nodal values (cf. Eqns. 61, 63, 64): 

N d N d N d 

p=i p=i p=i 

where u and s are used to denote the pointwise band structure energy and pointwise entropy respectively. 
Further, the nodal values (P , u p and s p depend only on values of the spectral quadrature points {\ p k }^ =l and 
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weights {w?}£ =1 associated with the p th node. (cf. Eqns. 62, 65, 66) 



(75) 



fc=i 



K 



(76) 



k=l 



K 



2k B £ w{[g{\l, A/) log <?(A^, A/) + (1 - g(\ p k ,\f)) log(l - ff(Af , A/))]. (77) 



A-l 



Finally, recall that we can compute {\ p k }^ =1 and {wu}k=i associated with a particular node independently 
of all the other nodes. This observation provides the key idea behind coarse-graining. 

When a defect is introduced in an otherwise perfect lattice, we expect complex fluctuations in the elec- 
tron density and other fields in the vicinity of the defect, but relatively smooth decay away from the defect. 
Specifically, we expect the atomic displacements to decay in some polynomial fashion and the perturbation 
in the electron density, pointwise band structure energy and pointwise entropy to decay to zero away from 
the defect. We wish to take advantage of this long range decay structure to develop an accurate reduced 
representation. Below, we discuss in detail how this is achieved. 

5.1.1. A tomic positions 

We select representative atoms and interpolate the displacement for the remaining atoms as is done in 
the quasicontinuum method (cf., e. g., Tadmor et al. (1996); Knap and Ortiz (2001)). The density of the 
representative atoms is chosen so as to be high in the vicinity of the defect and to decrease with increasing 
distance to the defect. This varying resolution can be achieved through a variety of adaptive refinement 
schemes (Tadmor et al., 1996; Knap and Ortiz, 2001). 

5.1.2. Electron density, band structure energy and entropy 

Since we expect the perturbation (due to defect) in the electron density, pointwise band structure energy 
and pointwise entropy to decay to zero, we seek an approximation scheme with the following decomposition 



where po, uq and so are piecewise periodic, while pj, Ud and Sd decay in a smooth fashion away from the 
defect, po, uq and so can be determined by an inexpensive periodic calculation. Since pd, Ud and Sd decay 
smoothly, we can develop an adaptive method for computing them. To this end, we pick representative 
nodes, the collection of which we denote by 1Z. The density of finite-difference nodes in 7Z is chosen to 
be high in the vicinity of the defect and to progressively decrease with increasing distance from the defect. 
We then use the LSSGQ method to evaluate the spectral Gauss quadrature points and weights only at the 
representative nodes {x p ,p € 7Z}. We then evaluate the Fermi energy by solving for the constraint (Eqn. 
74) with the interpolation scheme 



for the nodes {x p ,p G M — TZ}. 7q are weights decided by the degree of interpolation/approximation used. 
For the examples in the next section, we use cubic spline interpolations to determine the weights (Knott 
(2000)). Once we locate the Fermi energy, we calculate the electron density at all the finite-difference nodes 



P = Po + Pd, u = u + u d , s = s + s d 



(78) 




(79) 
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using Eqn. 75 for the nodes {x p , p G 71} and Eqn. 79 for the nodes {x p , p G J\f — 7Z}. The free energy can 
be determined by evaluating the nodal band structure energy, nodal entropy using Eqns. 76, 77 and using 
the interpolation 

qeiz 

s P = ^ + E*"- s o)- (81) 

for nodes {x p , p 6 N — 1Z). Since we expect p^, and to smoothly decay to zero away from the defect, 
we can use fewer and fewer representative finite-difference nodes away from the defect to capture these 
perturbations. As a consequence, the method will display sub-linear scaling with respect to the number of 
atoms. Therefore, this method opens the possibility of studying systems of much larger size compared to 
the linear-scaling LSSGQ method. 

5.1.3. Electrostatic potential 

We write the electrostatic potential 

4> = 4>o + 4>d (82) 
and evaluate the perturbation tp^ by solving the equation 

- -r- V 2 ^ = p - po + b - bo. (83) 

4-7T 

By its very nature, 4>d is localized near the defect and decays to zero away from it. Therefore, it can be easily 
calculated by using any adaptive discretization tailored to this feature. 

We summarize the CGDFT method in Algorithm 2. 



5.2. Validation 

In this section, we verify the CGDFT method by applying it to a one-dimensional model problem and a 
Kohn-Sham surface relaxation problem, in Sections 5.2.1 and 5.2.2 respectively. 

5.2.1. One-dimensional model 

We study the efficiency of CGDFT in the framework of the one-dimensional model presented in Section 
4.5.1. Consider a chain consisting of 101 atoms with unit spacing. We introduce a point defect by removing 
the center atom. We coarse-grain in both directions from the position of the defect and adopt the procedure 
described in Section 5. 1 to solve for the electron density and the energy. We plot the error in the defect energy 
and defect electron density so obtained, as a function of the number of representative nodes in Fig. 10. The 
error is measured and normalized with respect to the fully resolved LSSGQ solution. It is clear that we get 
rapid convergence in the solution for both metals and insulators. As expected, convergence in the case of 
insulators is much more rapid owing to the highly localized nature of the defect perturbation. Therefore, 
CGDFT can be utilized to solve the defect problem at a small fraction of the original computational cost, 
without any significant loss of accuracy. 
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Algorithm 2: CGDFT method 



Generate guess for positions of the representative atoms (R) 
repeat Relaxation of representative atoms 
Calculate charge density of the nuclei (6) 
Generate piecewise periodic po an d (f>o 
Generate guess for electron density (p) 
repeat Self-Consistent loop: SCF 

Calculate electrostatic potential (0) by solving Eqn. 83 
Calculate exchange correlation potential (V xc ) 

Calculate spectral quadrature points and weights at representative finite-difference nodes 
Calculate Fermi energy (A/) using Eqn. 74 (utilizing interpolation given by Eqn. 79) 
Calculate electron density at representative nodes (fP,p S 1Z) using Eqn. 75 
Interpolate the electron density to nodes (p p ,p G N — 1Z) using Eqn. 79 
Update the electron density (mixing) 
until Convergence of self-consistent iteration; 
Calculate the forces on the nuclei using Eqn. 21 
until Energy minimized with respect to positions of representative atoms', 
Evaluate free energy F using Eqns. 74, 75, 76, 77, 80 and 81 for the band structure energy and 
entropy. 




'.12 



Size of ft/Size of N Size of ft/Size of N 

Figure 10: Convergence of the CGDFT method for the one-dimensional model problem with a point defect 



5.2.2. Kohn-Sham problem: Surface relaxation 

We now study the (001) surface relaxation of BCC sodium using CGDFT, previously analyzed using the 
LSSGQ method in Section 4.5.2. We employ the procedure described in Section 4.5.2 with the introduction 
of coarse-graining described in Section 5.1. However, we do not introduce coarse-graining for the electro- 
static potential since the computational time involved in its calculation is negligible. We compare the results 
obtained using CGDFT with the fully resolved solution obtained using the LSSGQ method. A summary of 
the results is presented in Table 4. We can see that there are significant computational savings when using 
CGDFT at no appreciable loss of accuracy. It should be carefully noted that the surface relaxation problem is 
effectively one-dimensional for purposes of coarse-graining. As a consequence, the computational savings 
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accrued are likely to be less substantial than in fully two or three dimensional problems. 



Table 4: Summary of the results for CGDFT 



Number of representative nodes/Total number of nodes 
Number of representative atoms/Total number of atoms 
Normalized error in surface energy 
Normalized error in displacement of atoms 



0.25 
0.3 

0.12 % 



First layer 



0.11 % 
0.17 % 



Second layer 



6. Conclusions 

We have developed a real-space formulation for coarse-graining DFT (CGDFT). Traditional implemen- 
tations of DFT solve for the the orbitals, which results in cubic-scaling with respect to the number of atoms. 
Furthermore, they are not amenable to coarse-graining because of the global nature of the orthonormal- 
ity constraint. In order to overcome this difficulty, we have developed a linear-scaling method for DFT 
(LSSGQ), where we directly evaluate the electron density without evaluating the individual orbitals. We 
accomplish this direct evaluation by performing Gauss quadrature over the spectrum of the linear Hamil- 
tonian operator in each iteration of the SCF method. As a second approximation step, we have appended 
coarse-graining approximations to the LSSGQ method. Specifically, we decompose the solution into a pe- 
riodic component and the perturbation caused by lattice defects. Since the perturbation is expected to decay 
rapidly away from the defect, fine resolution in only needed in the vicinity of the defect and the resolution 
can be coarse-grained elsewhere. This coarse-graining enables the analysis of defects at a fraction of the 
computational cost, without any appreciable loss of accuracy. We have validated the LSSGQ and CGDFT 
methods through a number of examples of application, the results of which are in good agreement with lit- 
erature values. Furthermore, we have verified the convergence of the CGDFT solution to the fully resolved 
solution. These verification and validation examples, taken together, show CGDFT to be highly transferable, 
efficient and accurate. 

We close by pointing to some outstanding issues and limitations of the CGDFT method and to possible 
extensions of the method thereof. A principal shortcoming of the implementation described in this paper 
is the non-automated nature of the coarse-graining approximations. An optimal and automated coarse- 
graining technique for CGDFT would greatly extend the range, scope and robustness of the method. In 
addition, a Rayleigh-Ritz or Galerkin spatial discretization scheme, e. g., based on finite elements, would 
endow the approximation scheme with a variational structure, thereby opening the way to energy-based 
adaptive methods. In addition, the convergence of variational approximation schemes can be established by 
means of powerful tools such as T-convergence. However, it should be noted that finite-element bases lack 
orthogonality in general and result in generalized eigenvalue problems. This property may potentially add 
to the computational cost and implementational complexity associated with the method. These issues are 
currently being investigated by the authors. 
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A. Pade approximation and recursion method 

The recursion method (Hay dock, 1980) has found many applications in the physics literature. Here, we 
discuss the close connection between the LSSGQ and Recursion methods. In doing so, we also establish 
their relationship to the Pade approximation. Details of the notation not described here can be found in 
Sections 3 and 4. 

Let TZy.{ z ) = {zX — represent the resolvent of the operator % for z € p(H) = C \ a{%) 

where p(J-L) denotes the resolvent set of H and C is the set of all complex numbers. TZ^(z) is holomorphic 
Vz G p(U). For z € p{H), C G H it follows 

TZ H (z) = [ — ^— - d£ (A) , (84) 

Ja(H) Z ~ A 

G u (z) = (K H (z)(,() = I — — r d£f,f(A). (85) 

Ja(H) 2 ~ A 

Taking the series expansion of G^^(z) around the point z = oo, we obtain 

= E^n/ A fc ^ c , c (A) = ]T 



OO 



k=0 " u k=0 



z k+l 



(86) 



which converges for \z\ > 

Diagonal Pade approximants for G^^(z) are rational functions of the form qK{z)/pK{z) which satisfy 
the following property 

and are locally the best rational approximations for a given power series like Eqn. 86 (cf., e. g., Suetin 
(2002)). The denominator polynomials pk(z), are a set of orthogonal polynomials with respect to the 
measure (cf., e. g., Moren and Branquinho (2008)) 



/ 



b 

\ k PK (\)d£ c , c (\) = 0, k = 0,l,...,K -1 (88) 



and the numerator polynomial qx{z) can be expressed in terms of pk{z) as follows 

/ \ f b Pk(z) -pk(X) j C ... 

q K {z)= \ ^C,C( A )- ( 89 ) 

J a z — A 

The Pade approximants are constructed using the coefficients of the power series and provide an efficient 
analytic continuation of the series beyond the domain of convergence, which in the present case is given 
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by \z\ > \\H\\. The convergence of diagonal Pade approximants follows from Markov's theorem (cf., e. g., 
Suetin (2002)), whereby any Markov function like G^^(z) can be recovered in p(H) from the coefficients 
of the Laurent expansion of the function at the point z = oo (i.e. from the moments of the measure £f 

The orthogonal polynomials {p k } k=1 can ^ e generated via the recurrence relation (cf., e. g., Golub and Meurant 
(2010)) 



pjfc + i(A) = (A - a k+1 )p k (X) - 6|p fc _i(A) , k = 0, 1, . . . ,K - 1 

p_i(A)=0, p (A) = l 



(90) 



where 



a-k+i 



(Pk,Pk)c 
(Pk,Pk)( 



k = 0, 1,... ,K- 1 
k = 1,2,...,K-1. 



(91) 



Using Eqns. 88 and 89, we see that {q k } k= i can b e generated with a similar recurrence relation as that for 
{pk}k=i but with different initializing conditions 



%+i(A) = (A - a fc+ i)g fc (A) - b\q k -i{\) , k = 0, 1, . . . , K - 1 

<7_i(A) = -l, c?o(A) = Oand b 2 = 1. 

Corresponding to these recurrence relations we have the tridiagonal matrix 

\ 



(92) 



Jk 



( at 1 

b\ a 2 



V 



1 



b 2 K _ 2 a-K-i 1 

&K-1 a K J 



(93) 



Expanding &et{zl k — J k ) with respect to its last row, we obtain the recurrence formula for the determinant 

det(z4 + i - J k+1 ) = (z - a k+1 )det(zl k - J k ) - b k det(zl k -i - J k -i) , k = 0, 1, . . . , K - 1 

det(z/_i - J_i) = , det(>lo - J ) = 1 (94) 

which is the same as that satisfied by {p k } k =Q as well as {det(zl k — J k )} k =Q- Consequently 

Pk(z) = det(zI K - Jk) = det(zI K - Jk)- (95) 

Similarly 

q K (z) = det(zlK-i - Jk-i) = det(zI K -i - Jk-i) (96) 

where the superscript i is used to denote the matrix obtained by deleting the first i rows and columns. 
Therefore 

Qk{z) detjzlK-! - J\^) f x 

— r-r = — — — x—— = ei(zI K -JK) &i- (97) 

Pk{z) det(zI K - J K ) 

Above, the first column of Ik is denoted by e±. 
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Expanding det(zlK — Jk) about the first row or column we obtain from Eqn. 97 



Pk{z) det(zI K -2 ~ J? 



K-2) 

z — a\ 



det(zI K -2 - J K _ 2 ) 

and continuing similarly we obtained a continued fraction representation for the Pade approximant 

Qk(z) 1 

Pk(z) b\ 



(99) 



Ol 



02 



z — ax 



In the recursion method, the above expression is used to find the projected density of states (Haydock, 1980) 
from which all the necessary quantities are evaluated. Since the projected density of states has poles at the 
zeros of pk{z), a number of techniques to smoothen it have been developed. These include either using 
a terminator, simplest one being {afc}£L^ +1 = a °°> {^k}'k = K = or evaluating the Pade approximant 
slightly away from the real axis. However, these are uncontrolled approximations and can sometimes lead 
to inaccuracies. 

Finally, we look at the connection between Pade approximation and Gauss quadrature (cf., e. g., Van Assche 
(2006)). Multiplying both sides of Eqn. 87 by a polynomial of degree at most 2K — 1 denoted by tt2k~i(z), 
integrating along a contour T encircling the real line and subsequently using Eqn. 85 we obtain 

rb K 

(100) 



/ 7T2ftr-l(A)d£'^ ?? (A) = ^ Wjir 2 K-l(^j 
J a ■ t 



where 

i - ^ (101) 

is the residue of the Pade approximant at the zeros \j of px- It is clear that all polynomials of degree 2K — 1 
are integrated exactly, therefore establishing the connection between the Pade approximation, recursion 
method and the LSSGQ method. 



B. LSSGQ method applied to systems with periodicity 

In this section, we discuss the application of the LSSGQ method to the study of periodic systems. 
Typically, this would involve the need for Bloch periodic boundary conditions on the orbital. However, 
since we directly evaluate the electron density and not the individual orbitals, we can circumvent the need 
for Bloch periodic boundary conditions using the procedure described below. For clarity, we discuss the 
procedure in terms of the finite-difference approximation. However, the method is not restricted by the 
choice of basis. 

First, we specify a representative volume Qrv which has the property that the solution for the entire 
system can be ascertained through a translational mapping of the solution obtained in Qrv- We use a finite- 
difference grid of uniform spacing h to discretize Qrv, which we denote by XRV- Next, we define an 
extended volume Qev 3 ^ilV> which is discretized using a finite-difference grid of spacing h, denoted by 
Xev ^ XRV- We evaluate the spectral quadrature nodes and weights for xrv using the procedure outlined 
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in Section 4.4. It is important to note that the size of Q ev is chosen such that all the vectors in the Lanczos 
iteration given by Eqn. 59 have nonzero components only for finite-difference nodes inside Qev- Since the 
initial vector vo always has a single nonzero entry, the required size of Q ev can be easily calculated using 
the bandwidth of the discretized H and the number of spectral quadrature points used. 
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